In 2020, over 229 million cases and 409,000 fatalities were attributed to malaria, even though it is both preventable and treatable.
Analyzing blood films with a microscope, the standard approach for detecting malaria, is labor-intensive, time-consuming, and requires expert knowledge. Data exploration revealed a distinctive feature found in infected cells - a purple discoloration. This project proposes a Convolutional Neural Network (CNN) model for malaria diagnosis employing microscopic cell images. Six different iterations of the model were created and tested, with a sigmoid-activation model proving to be the most successful due to its high accuracy (98.3%) and low number of false negatives.
Computational infrastructure, data collection and labeling for model improvement, integration with existing healthcare systems, and user training are all costly endeavors that must be undertaken before widespread adoption can occur. The application saves time and money compared to traditional methods of diagnosing malaria using blood films, and it improves diagnostic accuracy. The possible risks are a lack of privacy and security, noncompliance with regulations, poor data quality and insufficient infrastructure.
Future research should focus on refining the model through transfer learning, expanding disease detection capabilities with multiclass classification models, and keeping a close eye on the model's performance.
Malaria is a potentially fatal illness caused by parasites transmitted through the bites of infected female Anopheles mosquitoes. Despite being preventable and treatable, it continues to be a serious worldwide health issue. For 2020, the World Health Organization estimated that over half of the world's population was at risk of contracting malaria, which caused an estimated 229 million cases and 409,000 fatalities worldwide. The disease mostly affects the most vulnerable and impoverished people, particularly in Sub-Saharan Africa, where over two-thirds of annual malaria deaths occur among children under five.
To lessen the severity of the illness and avoid fatalities, early detection and treatment of malaria are essential. However, access to diagnostic procedures and medical care is constrained in many areas where malaria is widespread. Accurate and prompt identification of malaria is also essential to avoid overusing antimalarial medications, which can result in drug resistance. Examining blood films under a microscope is the classic approach for diagnosing malaria. Accurately identifying and counting malaria parasites demands considerable expertise, and the process can be slow, delaying both diagnosis and treatment.
Therefore, the development of effective, reliable, and easily usable malaria detection techniques is urgently needed. A model for malaria diagnosis using microscopic images could be a game-changing solution by leveraging advances in machine learning and image processing. In addition to lightening the load on medical staff, it would speed up diagnosis and enable timely and effective treatment. As a result, it has the potential to drastically lower malaria morbidity and mortality.
Such a solution might offer significant socioeconomic advantages in addition to the immediate health effects. In areas where malaria is a serious problem, it might increase productivity and economic growth as healthier communities are better able to support their local economy. A successful machine learning model may also be used as a model for combating other infectious illnesses, opening up fresh possibilities in the struggle against dangers to world health.
The proposed solution employs a Convolutional Neural Network (CNN) model for image classification, which is well suited for this issue due to its demonstrated performance in image recognition tasks and its capacity to preserve the spatial relationships between pixels, an important consideration when identifying distinctive features within microscopic images of cells.
In total, 6 CNN models were designed with varying specifications. As seen in Figure 1, image inputs were also manipulated to test whether making the distinctive features of each type of cell more detectable would have any impact on the accuracy of the models. From the image dataset, it was clear that the infected cells had a distinctive purple discoloration that needed to be learned by the model.
The metrics of success used to determine the best model were a combination of accuracy ranking and an analysis of the types of classification mistakes made by the models (minimizing false negatives). In this problem statement, it is important that the models are accurate when classifying cells as infected and non-infected. It is also highly important that the model minimizes the instances where it incorrectly diagnoses an infected cell as healthy (referred to as a false negative) to avoid scenarios where infected patients are left undiagnosed and untreated.
Figure 2 shows the performance of the various models compared to each other. The highlighted model is the proposed solution which boasts an impressive accuracy of 98.3% while also having only 12 instances of false negatives from a test set of 2600 images. Despite model2 having an accuracy that matches model1’s, there are more instances of model2 classifying an infected cell as healthy and hence model1 is the preferable solution for this problem statement. Figure 2 also highlights that there was a negative impact on accuracy when the images were manipulated (model3 and model_hsv).
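The false-negative count used in this comparison can be read directly off a confusion matrix. As a minimal sketch with scikit-learn (the labels and predictions below are hypothetical, not the report's test set):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical ground-truth labels and model predictions (1 = infected, 0 = healthy)
y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

# With labels ordered [0, 1], the matrix unpacks as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f"False negatives: {fn}")  # infected cells classified as healthy
```

Minimizing `fn` alongside overall accuracy is exactly the selection criterion applied to the six models in Figure 2.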
Model1 uses the original images and treats the problem as a binary classification problem, meaning that the output can only be one of two things (hence the sigmoid activation in the output layer).
This ML model can greatly contribute to the diagnostic process, allowing for timely and proper treatment of patients, and possibly saving lives, by providing a robust, accurate, and rapid solution for malaria detection. The model can also help reduce the strain on medical staff in areas where malaria is prevalent. Its portability across a wide range of computing platforms and its speed in handling massive volumes of tests mean it can be used to offer accurate diagnoses to previously unreached places.
Deep learning models require significant computational resources. Implementing this model will require setting up an infrastructure that makes these resources available in a cost-effective and scalable way so that it can have widespread use. Furthermore, integration into existing healthcare information systems will require the assistance of healthcare IT experts but would ultimately assist in seamless adoption and use by healthcare professionals. Future-proofing this model will require maintenance as new image data over time can impact the accuracy of detection. Drug-resistant parasites may evolve over time to infect healthy cells in new manners and therefore may produce other distinctive features that need to be taught to the model. User training on how to input data in a standardized format will also assist in the maintenance and use of the model. In summary, stakeholders are required to invest in the necessary computational infrastructure, continue to collect and label images for model improvement, work with healthcare IT experts for integration into existing systems and train professionals on how to use the system.
Hosting this model on AWS would cost around 10,000 USD annually per 100,000,000 predictions (0.10 USD per 1000 predictions according to Amazon ML quotes). The cost of integration into existing systems would be the labor cost of software engineers or a consulting firm and would depend on the complexity of the existing systems. User training could be conducted using instructional videos with online distribution, which could also be made the responsibility of the integration team. Maintaining and improving the model would require a data scientist or a machine learning engineer, and so the cost would be the yearly labor cost of one professional (~$120,000).
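The hosting figure follows directly from the quoted rate; a quick sanity check (the salary line is the report's assumed figure, not a market quote):

```python
# Quoted AWS rate: 0.10 USD per 1,000 batch predictions
rate_per_1000 = 0.10
predictions_per_year = 100_000_000

hosting_cost = predictions_per_year / 1000 * rate_per_1000  # annual hosting
maintenance_cost = 120_000  # assumed yearly labor cost of one ML professional

total_annual_cost = hosting_cost + maintenance_cost
print(hosting_cost, total_annual_cost)  # 10000.0 130000.0
```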
The benefit of implementing this model would be a faster, more accurate, and more cost-effective diagnosis of malaria. The labor cost and time required for lab technicians to conduct manual tests of blood films would be greatly reduced with an increase in the accuracy of correct diagnosis.
Ensuring patient health data privacy is a critical risk to this solution. Mishandling of patient data due to insecure protocols can have legal repercussions as well as personal repercussions for those patients involved. Robust security measures must be upheld to maintain the integrity of this data.
Data quality must also be monitored as the performance of the model is highly dependent on the data that it is trained on. Inaccurate data can negatively impact the model’s performance over time leading to misdiagnosis. User training and additional monitoring is required to ensure the data is of good quality.
For low-resource locations where computing resources or internet connectivity is limited, the implementation of this model may be more technically challenging. Local solutions would have to be available for adoption in such settings and may cost more than the cloud alternative.
The most immediate consideration is to attempt to improve the model through transfer learning. There are existing models that have been trained on many more cell images than the proposed solution. These pre-trained models can be used to reduce the training time and data requirements for improved accuracy. Furthermore, the model can be extended to identify more than one type of disease. Multiclass classification models could be developed to assist with more types of diagnosis and would also help with the adoption of the model as its value to healthcare professionals increases from one domain to many domains. Lastly, performance monitoring is key to future-proofing this model. Analysis of its performance over time can prevent accuracy drops or other functional issues.
Amazon (2023) Getting started, Amazon. Available at: https://aws.amazon.com/getting-started/projects/build-machine-learning-model/services-costs/
World Health Organization, 2022. World malaria report 2022. World Health Organization.
The context: Why is this problem important to solve?
Malaria is a life-threatening disease caused by parasites that are transmitted to people through the bites of infected female mosquitoes. More accurate detection of infected cells can ensure that patients suffering from malaria receive the medical attention they require. In scenarios where malaria is in its early stages, it may be harder for the human eye to tell at a glance whether a cell is infected or not. ML models may achieve higher predictive accuracy and therefore help ensure that patients are not misdiagnosed. Developing a model for quick and efficient detection of malaria parasites in cells can be a significant step towards combating this disease.
The objectives: What is the intended goal?
Develop an ML model that has a high predictive accuracy in classifying a cell image as infected or uninfected, with as few false negatives as possible. This could potentially assist healthcare providers in diagnosing the disease, reducing the time and increasing the accuracy of diagnosis, and helping to provide appropriate treatment more quickly.
The key questions: What are the key questions that need to be answered?
What are the distinctive features separating the cells? What kind of classification is required? What kind of architecture can best predict infected vs uninfected cells accurately? What kind of preprocessing is required to improve the accuracy of the models? Can the developed model generalize well for unseen data?
The problem formulation: What is it that we are trying to solve using data science?
We are trying to solve a binary classification problem where the two classes are "parasitized" and "uninfected". Given a dataset of images of these two cell types, we want to train a model to learn from this data and make accurate predictions on unseen data. We want the model to be as accurate as possible while reducing false positives and especially false negatives.
There are a total of 24,958 train and 2,600 test images (colored) that we have taken from microscopic images. These images are of the following categories:
Parasitized: The parasitized cells contain the Plasmodium parasite which causes malaria
Uninfected: The uninfected cells are free of the Plasmodium parasites
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import zipfile
import os
from PIL import Image
import warnings
warnings.filterwarnings('ignore')
path = '/content/drive/MyDrive/Capstone Project/cell_images.zip'
# Zip extraction
with zipfile.ZipFile(path, 'r') as zipExtract:
    zipExtract.extractall()
The extracted folder has separate folders for train and test data, each containing images of varying sizes for parasitized and uninfected cells within the respective subfolder.
The size of all images must be the same and should be converted to 4D arrays so that they can be used as an input for the convolutional neural network. Also, we need to create the labels for both types of images to be able to train and test the model.
Let's do the same for the training data first and then we will use the same code for the test data as well.
# Function to load, resize, and label images
def process_images(data_dir, SIZE=64):
    image_data = []
    labels = []
    # Dictionary for label mapping
    label_mapping = {'/parasitized/': 1, '/uninfected/': 0}
    # Iterate through each folder in the directory
    for folder_name in ['/parasitized/', '/uninfected/']:
        # Get list of all files in the folder
        image_files = os.listdir(data_dir + folder_name)
        # Iterate through each image in the folder
        for image_name in image_files:
            try:
                # Load and resize the image
                image = Image.open(data_dir + folder_name + image_name)
                image = image.resize((SIZE, SIZE))
                # Convert the image to array and append to list
                image_data.append(np.array(image))
                # Append the label
                labels.append(label_mapping[folder_name])
            except Exception as e:
                print(f"Error: {image_name} {e}")
    return np.array(image_data), np.array(labels)
#Storing the path of the extracted "train" and "test" folders
train_dir = '/content/cell_images/train'
test_dir = '/content/cell_images/test'
#Process the training and testing data
train_images, train_labels = process_images(train_dir)
test_images, test_labels = process_images(test_dir)
train_images = np.array(train_images)
train_labels = np.array(train_labels)
test_images = np.array(test_images)
test_labels = np.array(test_labels)
print("Shape of train images: ", train_images.shape)
print("Shape of test images: ", test_images.shape)
Shape of train images:  (24958, 64, 64, 3)
Shape of test images:  (2600, 64, 64, 3)
print("Shape of train labels: ", train_labels.shape)
print("Shape of test labels: ", test_labels.shape)
Shape of train labels:  (24958,)
Shape of test labels:  (2600,)
There are 24,958 images in the training dataset and 2600 images in the test dataset.
Images Dimensions - 64x64x3 (height x width x colour channels; RGB, so 3)
Labels Dimensions - 1D array filled with 1/0 depending on whether the image represents parasitized or uninfected cells.
print("Train images:")
print("Min pixel value: ", np.min(train_images))
print("Max pixel value: ", np.max(train_images))
print("Test images:")
print("Min pixel value: ", np.min(test_images))
print("Max pixel value: ", np.max(test_images))
Train images:
Min pixel value:  0
Max pixel value:  255
Test images:
Min pixel value:  0
Max pixel value:  255
The images are in a standard format where 0 represents black, 255 represents white, and values in between represent varying shades of colors.
The range is consistent across both train and test datasets, which implies that the train and test sets have similar characteristics.
#In both label arrays, the label "1" represents parasitized cells and the label "0" represents uninfected cells, so we can sum instances of each label within the array to count them.
num_parasitized_train = np.sum(train_labels == 1)
num_uninfected_train = np.sum(train_labels == 0)
num_parasitized_test = np.sum(test_labels == 1)
num_uninfected_test = np.sum(test_labels == 0)
print("Training set: " + str(num_parasitized_train) + " parasitized, " + str(num_uninfected_train) + " uninfected")
print("Test set: " + str(num_parasitized_test) + " parasitized, " + str(num_uninfected_test) + " uninfected")
Training set: 12582 parasitized, 12376 uninfected
Test set: 1300 parasitized, 1300 uninfected
train_images = train_images / 255.0
test_images = test_images / 255.0
All pixel values will now be in the range [0,1]. Reducing the scale of pixel values will help the model to learn and converge faster.
# Create a list with our labels
labels = ['Uninfected', 'Parasitized']
# Count the occurrences of each class in the train set
counts_train = [np.sum(train_labels == 0), np.sum(train_labels == 1)]
# Count the occurrences of each class in the test set
counts_test = [np.sum(test_labels == 0), np.sum(test_labels == 1)]
# Create a figure and a set of subplots
fig, ax = plt.subplots()
# Define bar width
bar_width = 0.35
# Create bar plot for training data
rects1 = ax.bar(np.arange(len(labels)), counts_train, bar_width, label='Train')
# Create bar plot for test data
rects2 = ax.bar(np.arange(len(labels)) + bar_width, counts_test, bar_width, label='Test')
# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_xlabel('Classes')
ax.set_ylabel('Count')
ax.set_title('Counts by class and dataset')
ax.set_xticks(np.arange(len(labels)) + bar_width / 2)
ax.set_xticklabels(labels)
ax.legend()
# Display the plot
plt.show()
The sizes of the bars are approximately equal. This means there is roughly the same number of instances of each class (parasitized and uninfected) in the dataset and indicates that the dataset is balanced.
Let's visualize the images from the train data
#Displaying 20 random images from the train dataset alongside the labels
plt.figure(figsize=(10,10))
for x in range(20):
    plt.subplot(5,5,x+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    #random integer being chosen so that both parasitized and uninfected cells can be seen
    i = int(np.random.randint(0, train_images.shape[0], 1))
    plt.imshow(train_images[i])
    if train_labels[i]:
        plt.xlabel("Parasitized")
    else:
        plt.xlabel("Uninfected")
plt.show()
Parasitized cells have a distinct pink/purple discolouration compared to the colour scheme of their surroundings, whereas uninfected cells have relatively uniform colours throughout.
# Displaying 36 random images from the train dataset alongside the labels
plt.figure(figsize=(12,12))
for x in range(36): # Changed the number of iterations to 36
    plt.subplot(6,6,x+1) # Changed to a 6x6 grid
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    # random integer being chosen so that both parasitized and uninfected cells can be seen
    i = int(np.random.randint(0, train_images.shape[0], 1))
    plt.imshow(train_images[i])
    if train_labels[i]:
        plt.xlabel("Parasitized")
    else:
        plt.xlabel("Uninfected")
plt.show()
As above, parasitized cells have a distinct pink/purple discolouration compared to the colour scheme of their surroundings, whereas uninfected cells have relatively uniform colours throughout.
#Separating the images based on the labels
parasitized_images = train_images[train_labels == 1]
uninfected_images = train_images[train_labels == 0]
#Calculating mean images
mean_parasitized = np.mean(parasitized_images, axis=0)
mean_uninfected = np.mean(uninfected_images, axis=0)
Mean image for parasitized
plt.figure(figsize=(10,5))
plt.subplot()
plt.imshow(mean_parasitized)
plt.title('Mean Parasitized')
plt.show()
Mean image for uninfected
plt.figure(figsize=(10,5))
plt.subplot()
plt.imshow(mean_uninfected)
plt.title('Mean Uninfected')
plt.show()
The mean image of the parasitized cells is a slightly darker hue of pink than that of the uninfected cells. The difference suggests that there are distinctive visual features that the ML model could learn to make accurate predictions. It also indicates that the class mean images alone carry some signal for differentiating parasitized from uninfected cells.
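As a sanity check of that observation, one could classify a cell by whichever class mean image it is closer to in pixel space. This is only an illustrative sketch with synthetic stand-in data, not the proposed CNN:

```python
import numpy as np

def nearest_mean_predict(images, mean_parasitized, mean_uninfected):
    # Euclidean distance from each image to each class mean image
    d_par = np.sqrt(((images - mean_parasitized) ** 2).sum(axis=(1, 2, 3)))
    d_un = np.sqrt(((images - mean_uninfected) ** 2).sum(axis=(1, 2, 3)))
    # Predict the closer class: 1 = parasitized, 0 = uninfected
    return (d_par < d_un).astype(int)

# Tiny synthetic demo with 4x4 RGB "images" (stand-ins for real cell data)
mean_par = np.full((4, 4, 3), 0.8)
mean_un = np.full((4, 4, 3), 0.2)
demo = np.stack([np.full((4, 4, 3), 0.75), np.full((4, 4, 3), 0.1)])
preds = nearest_mean_predict(demo, mean_par, mean_un)
print(preds)  # [1 0]
```

Such a baseline would set a floor that the CNN models should comfortably beat.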
import cv2
# Converting train images to HSV
train_images_hsv = []
for image in train_images:
    # Converting image to float32 datatype
    image = image.astype(np.float32)
    # Converting image from RGB to HSV (PIL loads images as RGB, so RGB2HSV is the correct flag)
    image_hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
    train_images_hsv.append(image_hsv)
# Converting the list to a numpy array
train_images_hsv = np.array(train_images_hsv)
#Visualizing the images after conversion
# Generating 3 random indices
random_indices = np.random.choice(len(train_images_hsv), size=3, replace=False)
# Creating 3 subplots
fig, axs = plt.subplots(1, 3, figsize=(12, 4))
# Iterating over the subplots and randomly selected indices
for i, ax in zip(random_indices, axs):
    #Finding labels by comparing to train labels
    if train_labels[i] == 1:
        img_label = "Parasitized"
    else:
        img_label = "Uninfected"
    # Display the image in HSV format
    ax.imshow(train_images_hsv[i])
    ax.set_title("Index:" + str(i) + " " + img_label)
    ax.axis("off")
plt.show()
WARNING:matplotlib.image:Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
# Converting test images to HSV
test_images_hsv = []
for image in test_images:
    # Converting image to float32 datatype
    image = image.astype(np.float32)
    # Converting image from RGB to HSV (PIL loads images as RGB, so RGB2HSV is the correct flag)
    image_hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
    test_images_hsv.append(image_hsv)
# Converting the list to a numpy array
test_images_hsv = np.array(test_images_hsv)
#Visualizing the images after conversion
# Generating 3 random indices
random_indices = np.random.choice(len(test_images_hsv), size=3, replace=False)
# Creating 3 subplots
fig, axs = plt.subplots(1, 3, figsize=(12, 4))
# Iterating over the subplots and randomly selected indices
for i, ax in zip(random_indices, axs):
    #Finding labels by comparing to test labels
    if test_labels[i] == 1:
        img_label = "Parasitized"
    else:
        img_label = "Uninfected"
    # Displaying the image in HSV format
    ax.imshow(test_images_hsv[i])
    ax.set_title("Index:" + str(i) + " " + img_label)
    ax.axis("off")
plt.show()
WARNING:matplotlib.image:Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
By converting to HSV, the colour information is separated into distinct channels: hue, saturation, and value, which facilitates colour-based analysis such as colour segmentation or object detection. The images above are an example of how there is now a clearer distinction between uninfected and parasitized cells.
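The effect can also be seen numerically with the standard library's colorsys module. The two RGB pixels below are hypothetical stand-ins for a purple-stained region and pink cytoplasm, chosen only to illustrate how the hue channel isolates the colour difference:

```python
import colorsys

# Hypothetical stain-like pixel vs. background cytoplasm pixel (R, G, B in [0, 1])
purple_stain = (0.5, 0.2, 0.6)
pink_cytoplasm = (0.9, 0.6, 0.7)

h_stain, s_stain, _ = colorsys.rgb_to_hsv(*purple_stain)
h_bg, s_bg, _ = colorsys.rgb_to_hsv(*pink_cytoplasm)

# The hue channel alone separates the two colours clearly
print(round(h_stain, 3), round(h_bg, 3))
```

In RGB, the same difference is smeared across all three channels, which is one intuition for why an HSV representation can help colour-based segmentation.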
import cv2
# Apply Gaussian blurring to train images
train_images_blurred = []
for image in train_images:
    # Converting image to float32 datatype
    image = image.astype(np.float32)
    # Applying Gaussian blur with a kernel size of (5, 5) and sigma of 0
    blurred_image = cv2.GaussianBlur(image, (5, 5), 0)
    train_images_blurred.append(blurred_image)
# Convert the list to a numpy array
train_images_blurred = np.array(train_images_blurred)
#Visualizing the images after conversion
# Generating 3 random indices
random_indices = np.random.choice(len(train_images_blurred), size=3, replace=False)
# Creating 3 subplots
fig, axs = plt.subplots(1, 3, figsize=(12, 4))
# Iterating over the subplots and randomly selected indices
for i, ax in zip(random_indices, axs):
    #Finding labels by comparing to train labels
    if train_labels[i] == 1:
        img_label = "Parasitized"
    else:
        img_label = "Uninfected"
    # Display the blurred image
    ax.imshow(train_images_blurred[i])
    ax.set_title("Index:" + str(i) + " " + img_label)
    ax.axis("off")
plt.show()
# Apply Gaussian blurring to test images
test_images_blurred = []
for image in test_images:
    # Converting image to float32 datatype
    image = image.astype(np.float32)
    # Applying Gaussian blur with a kernel size of (5, 5) and sigma of 0
    blurred_image = cv2.GaussianBlur(image, (5, 5), 0)
    test_images_blurred.append(blurred_image)
# Convert the list to a numpy array
test_images_blurred = np.array(test_images_blurred)
#Visualizing the images after conversion
# Generating 3 random indices
random_indices = np.random.choice(len(test_images_blurred), size=3, replace=False)
# Creating 3 subplots
fig, axs = plt.subplots(1, 3, figsize=(12, 4))
# Iterating over the subplots and randomly selected indices
for i, ax in zip(random_indices, axs):
    #Finding labels by comparing to test labels
    if test_labels[i] == 1:
        img_label = "Parasitized"
    else:
        img_label = "Uninfected"
    # Display the blurred image
    ax.imshow(test_images_blurred[i])
    ax.set_title("Index:" + str(i) + " " + img_label)
    ax.axis("off")
plt.show()
The dataset doesn't have a lot of noise to begin with, so the Gaussian blurring approach, in attempting to remove noise from the images, may make it harder for the ML models to differentiate the distinctive characteristics of the two types of cells.
Instead, we can attempt data augmentation to improve the model's ability to detect the differentiating characteristics regardless of orientation and therefore improve the robustness of the system.
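A minimal, NumPy-only version of such augmentation uses random flips and quarter-turn rotations, which change orientation without altering cell morphology (a fuller pipeline, e.g. Keras's ImageDataGenerator, would add shifts and zooms):

```python
import numpy as np

def augment(image, rng):
    # Random horizontal / vertical flips
    if rng.random() < 0.5:
        image = np.fliplr(image)
    if rng.random() < 0.5:
        image = np.flipud(image)
    # Random rotation by 0, 90, 180, or 270 degrees
    return np.rot90(image, k=int(rng.integers(0, 4)))

rng = np.random.default_rng(25)
sample = np.arange(64 * 64 * 3, dtype=np.float32).reshape(64, 64, 3)
augmented = augment(sample, rng)
print(augmented.shape)  # square images keep their shape: (64, 64, 3)
```

Because every transform is a permutation of pixels, the augmented images stay in the same value range as the normalized inputs.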
We could also try transfer learning by using a model that has been pretrained on a larger set of cell images. This could improve classification accuracy.
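One way such transfer learning might look in Keras is a frozen pretrained backbone with a new sigmoid head. The backbone choice (MobileNetV2) and head layout below are assumptions for illustration, not part of the report; `weights=None` is used only so the sketch runs offline, whereas real transfer learning would load `weights="imagenet"` (or weights from a cell-image model):

```python
import numpy as np
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras import layers, models

# Backbone without its classification head; weights=None is a stand-in here —
# actual transfer learning would load pretrained weights instead.
base = MobileNetV2(input_shape=(64, 64, 3), include_top=False, weights=None)
base.trainable = False  # freeze the backbone; only the new head is trained

transfer_model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),  # binary: parasitized vs. uninfected
])
transfer_model.compile(optimizer="adam", loss="binary_crossentropy",
                       metrics=["accuracy"])
pred = transfer_model.predict(np.zeros((2, 64, 64, 3), dtype="float32"))
print(pred.shape)  # one probability per image: (2, 1)
```

Freezing the backbone keeps training fast and data-efficient; fine-tuning its top layers afterwards is a common second step.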
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras import backend
import random
#Clearing the backend session
backend.clear_session()
# Setting the random seed to standardize outputs
np.random.seed(25)
random.seed(25)
tf.random.set_seed(25)
train_labels = to_categorical(train_labels, 2)
test_labels = to_categorical(test_labels, 2)
#Define the model
model = Sequential()
#Add convolutional layers
model.add(Conv2D(filters = 32, kernel_size = 2, padding = "same", activation='relu', input_shape=(64, 64, 3)))
model.add(MaxPooling2D(pool_size = 2))
model.add(Dropout(0.2))
model.add(Conv2D(filters = 32, kernel_size = 2, padding = "same", activation='relu'))
model.add(MaxPooling2D(pool_size = 2))
model.add(Dropout(0.2))
model.add(Conv2D(filters = 32, kernel_size = 2, padding = "same", activation='relu'))
model.add(MaxPooling2D(pool_size = 2))
model.add(Dropout(0.2))
#Flatten to convert 2D features to 1D vector for fully connect layers
model.add(Flatten())
#Add dense layers
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
# Print the model summary
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 64, 64, 32) 416
max_pooling2d (MaxPooling2D (None, 32, 32, 32) 0
)
dropout (Dropout) (None, 32, 32, 32) 0
conv2d_1 (Conv2D) (None, 32, 32, 32) 4128
max_pooling2d_1 (MaxPooling (None, 16, 16, 32) 0
2D)
dropout_1 (Dropout) (None, 16, 16, 32) 0
conv2d_2 (Conv2D) (None, 16, 16, 32) 4128
max_pooling2d_2 (MaxPooling (None, 8, 8, 32) 0
2D)
dropout_2 (Dropout) (None, 8, 8, 32) 0
flatten (Flatten) (None, 2048) 0
dense (Dense) (None, 128) 262272
dropout_3 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 2) 258
=================================================================
Total params: 271,202
Trainable params: 271,202
Non-trainable params: 0
_________________________________________________________________
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Using Callbacks
callbacks = [
    EarlyStopping(monitor='val_loss', patience=2),
    ModelCheckpoint(filepath='.mdl_wts.hdf5', monitor='val_loss', save_best_only=True)
]
Fit and train our Model
history = model.fit(train_images, train_labels, validation_data=(test_images, test_labels), callbacks=callbacks, epochs=10)
Epoch 1/10
780/780 [==============================] - 134s 169ms/step - loss: 0.4578 - accuracy: 0.7667 - val_loss: 0.1766 - val_accuracy: 0.9381
Epoch 2/10
780/780 [==============================] - 130s 167ms/step - loss: 0.1469 - accuracy: 0.9519 - val_loss: 0.1597 - val_accuracy: 0.9431
Epoch 3/10
780/780 [==============================] - 129s 165ms/step - loss: 0.1208 - accuracy: 0.9651 - val_loss: 0.1172 - val_accuracy: 0.9665
Epoch 4/10
780/780 [==============================] - 139s 178ms/step - loss: 0.1033 - accuracy: 0.9703 - val_loss: 0.1096 - val_accuracy: 0.9669
Epoch 5/10
780/780 [==============================] - 134s 172ms/step - loss: 0.0915 - accuracy: 0.9731 - val_loss: 0.0855 - val_accuracy: 0.9777
Epoch 6/10
780/780 [==============================] - 131s 168ms/step - loss: 0.0867 - accuracy: 0.9741 - val_loss: 0.0778 - val_accuracy: 0.9812
Epoch 7/10
780/780 [==============================] - 139s 178ms/step - loss: 0.0770 - accuracy: 0.9763 - val_loss: 0.0660 - val_accuracy: 0.9842
Epoch 8/10
780/780 [==============================] - 133s 171ms/step - loss: 0.0736 - accuracy: 0.9764 - val_loss: 0.0598 - val_accuracy: 0.9812
Epoch 9/10
780/780 [==============================] - 128s 165ms/step - loss: 0.0720 - accuracy: 0.9766 - val_loss: 0.0663 - val_accuracy: 0.9804
Epoch 10/10
780/780 [==============================] - 133s 171ms/step - loss: 0.0680 - accuracy: 0.9784 - val_loss: 0.0581 - val_accuracy: 0.9754
test_accuracy = model.evaluate(test_images, test_labels)
print("Test Accuracy:", test_accuracy)
82/82 [==============================] - 5s 58ms/step - loss: 0.0581 - accuracy: 0.9754
Test Accuracy: [0.05805911868810654, 0.9753845930099487]
Plotting the confusion matrix
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
#Predicting the classes for the test images
y_pred = model.predict(test_images)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true_classes = np.argmax(test_labels, axis=1)
# Printing the classification report will be useful too
print(classification_report(y_true_classes, y_pred_classes))
82/82 [==============================] - 3s 35ms/step
precision recall f1-score support
0 0.96 0.99 0.98 1300
1 0.99 0.96 0.97 1300
accuracy 0.98 2600
macro avg 0.98 0.98 0.98 2600
weighted avg 0.98 0.98 0.98 2600
cm = confusion_matrix(y_true_classes, y_pred_classes)
#Plot the confusion matrix using a heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.title("Confusion Matrix")
plt.xlabel("Predicted Labels")
plt.ylabel("True Labels")
plt.show()
Plotting the train and validation curves
#Plotting the train and validation accuracy
plt.figure(figsize=(8, 6))
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
Now let's build another model with a few additional layers and altered activation functions, and check whether it improves on the base model.
#Clearing the backend session
backend.clear_session()
# Setting the random seed to standardize outputs
np.random.seed(25)
random.seed(25)
tf.random.set_seed(25)
# Define the model
model1 = Sequential()
# Add convolutional layers, changed activation functions to tanh
model1.add(Conv2D(filters=32, kernel_size=2, padding="same", activation="tanh", input_shape=(64, 64, 3)))
model1.add(MaxPooling2D(pool_size=2))
model1.add(Dropout(0.2))
model1.add(Conv2D(filters=32, kernel_size=2, padding="same", activation="tanh"))
model1.add(MaxPooling2D(pool_size=2))
model1.add(Dropout(0.2))
model1.add(Conv2D(filters=32, kernel_size=2, padding="same", activation="tanh"))
model1.add(MaxPooling2D(pool_size=2))
model1.add(Dropout(0.2))
#New layers
model1.add(Conv2D(filters=32, kernel_size=2, padding="same", activation="tanh"))
model1.add(MaxPooling2D(pool_size=2))
model1.add(Dropout(0.2))
# Flatten to convert 2D features to 1D vector for fully connected layers
model1.add(Flatten())
# Add dense layer
model1.add(Dense(128, activation="tanh"))
model1.add(Dropout(0.5))
#Changed the output activation to sigmoid as this is a binary classification problem (infected or not infected)
#Labels for this model are not One hot encoded as maintaining the binary 1D array is necessary for an output layer of 1
model1.add(Dense(1, activation="sigmoid"))
# Print the model summary
model1.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 64, 64, 32) 416
max_pooling2d (MaxPooling2D (None, 32, 32, 32) 0
)
dropout (Dropout) (None, 32, 32, 32) 0
conv2d_1 (Conv2D) (None, 32, 32, 32) 4128
max_pooling2d_1 (MaxPooling (None, 16, 16, 32) 0
2D)
dropout_1 (Dropout) (None, 16, 16, 32) 0
conv2d_2 (Conv2D) (None, 16, 16, 32) 4128
max_pooling2d_2 (MaxPooling (None, 8, 8, 32) 0
2D)
dropout_2 (Dropout) (None, 8, 8, 32) 0
conv2d_3 (Conv2D) (None, 8, 8, 32) 4128
max_pooling2d_3 (MaxPooling (None, 4, 4, 32) 0
2D)
dropout_3 (Dropout) (None, 4, 4, 32) 0
flatten (Flatten) (None, 512) 0
dense (Dense) (None, 128) 65664
dropout_4 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 1) 129
=================================================================
Total params: 78,593
Trainable params: 78,593
Non-trainable params: 0
_________________________________________________________________
model1.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Using Callbacks
callbacks = [
EarlyStopping(monitor='val_loss', patience=2),
ModelCheckpoint(filepath='.mdl_wts.hdf5', monitor='val_loss', save_best_only=True)
]
Fit and Train the model
history1 = model1.fit(train_images, train_labels, validation_data=(test_images, test_labels), callbacks=callbacks, epochs=10)
Epoch 1/10
780/780 [==============================] - 148s 188ms/step - loss: 0.4278 - accuracy: 0.7840 - val_loss: 0.5854 - val_accuracy: 0.8058
Epoch 2/10
780/780 [==============================] - 141s 181ms/step - loss: 0.1592 - accuracy: 0.9437 - val_loss: 0.2362 - val_accuracy: 0.9077
Epoch 3/10
780/780 [==============================] - 141s 181ms/step - loss: 0.1259 - accuracy: 0.9576 - val_loss: 0.1576 - val_accuracy: 0.9473
Epoch 4/10
780/780 [==============================] - 141s 181ms/step - loss: 0.1092 - accuracy: 0.9636 - val_loss: 0.1092 - val_accuracy: 0.9573
Epoch 5/10
780/780 [==============================] - 139s 178ms/step - loss: 0.1002 - accuracy: 0.9664 - val_loss: 0.0842 - val_accuracy: 0.9746
Epoch 6/10
780/780 [==============================] - 141s 180ms/step - loss: 0.0957 - accuracy: 0.9687 - val_loss: 0.0851 - val_accuracy: 0.9738
Epoch 7/10
780/780 [==============================] - 146s 187ms/step - loss: 0.0924 - accuracy: 0.9707 - val_loss: 0.0749 - val_accuracy: 0.9773
Epoch 8/10
780/780 [==============================] - 139s 178ms/step - loss: 0.0868 - accuracy: 0.9715 - val_loss: 0.0676 - val_accuracy: 0.9823
Epoch 9/10
780/780 [==============================] - 141s 181ms/step - loss: 0.0892 - accuracy: 0.9717 - val_loss: 0.0625 - val_accuracy: 0.9792
Epoch 10/10
780/780 [==============================] - 141s 181ms/step - loss: 0.0833 - accuracy: 0.9733 - val_loss: 0.0585 - val_accuracy: 0.9835
test_accuracy1 = model1.evaluate(test_images, test_labels)
print("Test Accuracy:", test_accuracy1)
82/82 [==============================] - 3s 39ms/step - loss: 0.0585 - accuracy: 0.9835
Test Accuracy: [0.05848436802625656, 0.9834615588188171]
Plotting the confusion matrix
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
#Predicting the classes for the test images
y_pred_classes = (model1.predict(test_images) > 0.5).astype(int)
# Printing the classification report will be useful too
print(classification_report(test_labels, y_pred_classes))
82/82 [==============================] - 5s 55ms/step
precision recall f1-score support
0 0.99 0.98 0.98 1300
1 0.98 0.99 0.98 1300
accuracy 0.98 2600
macro avg 0.98 0.98 0.98 2600
weighted avg 0.98 0.98 0.98 2600
cm = confusion_matrix(test_labels, y_pred_classes)
#Plot the confusion matrix using a heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.title("Confusion Matrix")
plt.xlabel("Predicted Labels")
plt.ylabel("True Labels")
plt.show()
Plotting the train and the validation curves
#Plotting the train and validation accuracy
plt.figure(figsize=(8, 6))
plt.plot(history1.history['accuracy'], label='Train Accuracy')
plt.plot(history1.history['val_accuracy'], label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
The accuracy of the model has not improved. The tanh activation is also more computationally expensive than relu, so it is worth reverting. Switching the output layer to a sigmoid activation has not changed the accuracy either, but keeping it is an appropriate decision given the problem statement: this is a binary classification problem.
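Since the choice between a one-unit sigmoid head with binary labels and a two-unit softmax head with one-hot labels recurs throughout these models, it may help to note that the two are mathematically equivalent for two classes: a softmax over logits [0, z] reproduces sigmoid(z). A quick NumPy check, independent of the notebook's variables:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(v):
    e = np.exp(v - np.max(v))  # shift for numerical stability
    return e / e.sum()

# softmax([0, z])[1] = e^z / (1 + e^z) = sigmoid(z), so a
# Dense(1, activation="sigmoid") head with binary labels carries the
# same information as Dense(2, activation="softmax") with one-hot labels.
z = 1.7
p_sigmoid = sigmoid(z)
p_softmax = softmax(np.array([0.0, z]))[1]
print(abs(p_sigmoid - p_softmax) < 1e-12)  # True
```

The single-unit head is slightly cheaper and lets the labels stay as a 1D binary array, which is why it is kept for the remaining models.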
Let us try to build a model using BatchNormalization and using LeakyRelu as our activation function.
#Clearing the backend session
backend.clear_session()
# Setting the random seed to standardize outputs
np.random.seed(25)
random.seed(25)
tf.random.set_seed(25)
from keras.layers import LeakyReLU, BatchNormalization
#Define the model
model2 = Sequential()
#Add convolutional layers, changed activation functions to LeakyRelu and added Batch Normalization layers
model2.add(Conv2D(32, kernel_size=2, padding='same', input_shape=(64, 64, 3)))
model2.add(LeakyReLU(alpha=0.01))
model2.add(BatchNormalization())
model2.add(MaxPooling2D((2, 2)))
model2.add(Conv2D(32, kernel_size=2, padding='same'))
model2.add(LeakyReLU(alpha=0.01))
model2.add(BatchNormalization())
model2.add(MaxPooling2D((2, 2)))
model2.add(Conv2D(32, kernel_size=2, padding='same'))
model2.add(LeakyReLU(alpha=0.01))
model2.add(BatchNormalization())
model2.add(MaxPooling2D((2, 2)))
model2.add(Conv2D(32, kernel_size=2, padding='same'))
model2.add(LeakyReLU(alpha=0.01))
model2.add(BatchNormalization())
model2.add(MaxPooling2D((2, 2)))
#Flatten to convert 2D features to 1D vector for fully connected layers
model2.add(Flatten())
model2.add(Dense(128))
model2.add(LeakyReLU(alpha=0.01))
model2.add(BatchNormalization())
model2.add(Dropout(0.5))
#Changed the output activation to sigmoid as this is a binary classification problem (infected or not infected)
#Labels for this model are not One hot encoded as maintaining the binary 1D array is necessary for an output layer of 1
model2.add(Dense(1, activation='sigmoid'))
#Print the model summary
model2.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 64, 64, 32) 416
leaky_re_lu (LeakyReLU) (None, 64, 64, 32) 0
batch_normalization (BatchN (None, 64, 64, 32) 128
ormalization)
max_pooling2d (MaxPooling2D (None, 32, 32, 32) 0
)
conv2d_1 (Conv2D) (None, 32, 32, 32) 4128
leaky_re_lu_1 (LeakyReLU) (None, 32, 32, 32) 0
batch_normalization_1 (Batc (None, 32, 32, 32) 128
hNormalization)
max_pooling2d_1 (MaxPooling (None, 16, 16, 32) 0
2D)
conv2d_2 (Conv2D) (None, 16, 16, 32) 4128
leaky_re_lu_2 (LeakyReLU) (None, 16, 16, 32) 0
batch_normalization_2 (Batc (None, 16, 16, 32) 128
hNormalization)
max_pooling2d_2 (MaxPooling (None, 8, 8, 32) 0
2D)
conv2d_3 (Conv2D) (None, 8, 8, 32) 4128
leaky_re_lu_3 (LeakyReLU) (None, 8, 8, 32) 0
batch_normalization_3 (Batc (None, 8, 8, 32) 128
hNormalization)
max_pooling2d_3 (MaxPooling (None, 4, 4, 32) 0
2D)
flatten (Flatten) (None, 512) 0
dense (Dense) (None, 128) 65664
leaky_re_lu_4 (LeakyReLU) (None, 128) 0
batch_normalization_4 (Batc (None, 128) 512
hNormalization)
dropout (Dropout) (None, 128) 0
dense_1 (Dense) (None, 1) 129
=================================================================
Total params: 79,617
Trainable params: 79,105
Non-trainable params: 512
_________________________________________________________________
model2.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Using callbacks
callbacks = [
EarlyStopping(monitor='val_loss', patience=2),
ModelCheckpoint(filepath='.mdl_wts.hdf5', monitor='val_loss', save_best_only=True)
]
Fit and train the model
history2 = model2.fit(train_images, train_labels, validation_data=(test_images, test_labels), callbacks=callbacks, epochs=10)
Epoch 1/10
780/780 [==============================] - 196s 237ms/step - loss: 0.1825 - accuracy: 0.9286 - val_loss: 0.0782 - val_accuracy: 0.9712
Epoch 2/10
780/780 [==============================] - 183s 235ms/step - loss: 0.0864 - accuracy: 0.9715 - val_loss: 0.0668 - val_accuracy: 0.9808
Epoch 3/10
780/780 [==============================] - 181s 232ms/step - loss: 0.0784 - accuracy: 0.9730 - val_loss: 0.0513 - val_accuracy: 0.9862
Epoch 4/10
780/780 [==============================] - 184s 236ms/step - loss: 0.0682 - accuracy: 0.9770 - val_loss: 0.1631 - val_accuracy: 0.9600
Epoch 5/10
780/780 [==============================] - 182s 233ms/step - loss: 0.0663 - accuracy: 0.9778 - val_loss: 0.0881 - val_accuracy: 0.9762
Plotting the train and validation accuracy
#Plotting the train and validation accuracy
plt.figure(figsize=(8, 6))
plt.plot(history2.history['accuracy'], label='Train Accuracy')
plt.plot(history2.history['val_accuracy'], label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
The model learns and generalizes well initially but begins to overfit after the third epoch: the validation loss reaches its minimum at epoch 3 and rises afterwards. The early stopping callback halted training after two epochs without improvement, which helped prevent further overfitting.
test_accuracy2 = model2.evaluate(test_images, test_labels)
print("Test Accuracy:", test_accuracy2)
82/82 [==============================] - 5s 59ms/step - loss: 0.0881 - accuracy: 0.9762
Test Accuracy: [0.05848436802625656, 0.9834615588188171]
Despite the evaluation reporting a high accuracy, the accuracy plot indicates the overfitting problem described above.
Generate the classification report and confusion matrix
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
#Predicting the classes for the test images
y_pred_classes = (model2.predict(test_images) > 0.5).astype(int)
# Printing the classification report will be useful too
print(classification_report(test_labels, y_pred_classes))
82/82 [==============================] - 7s 86ms/step
precision recall f1-score support
0 0.96 0.99 0.98 1300
1 0.99 0.96 0.98 1300
accuracy 0.98 2600
macro avg 0.98 0.98 0.98 2600
weighted avg 0.98 0.98 0.98 2600
cm = confusion_matrix(test_labels, y_pred_classes)
#Plot the confusion matrix using a heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.title("Confusion Matrix")
plt.xlabel("Predicted Labels")
plt.ylabel("True Labels")
plt.show()
Data augmentation can help with the overfitting problem by creating new training samples through rotations, flips, and other transformations of the image dataset. This could make the model more robust.
#Clearing the backend session
backend.clear_session()
# Setting the random seed to standardize outputs
np.random.seed(25)
random.seed(25)
tf.random.set_seed(25)
#Only using the image data generator on the train data set; the test data is untouched so the model can be evaluated against real data
from keras.preprocessing.image import ImageDataGenerator
#Define parameters
datagen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True, zoom_range=0.5, rotation_range=45)
#Prepare an iterator that yields augmented training batches
train_iterator = datagen.flow(train_images, train_labels, batch_size=64, seed = 25, shuffle = True)
#Steps per epoch = no. of unique images in train data/batch size = 24958/64 (rounded up)
steps_per_epoch = 390
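Rather than hard-coding 390, the value can be derived from the dataset size, which keeps it correct if the train split ever changes. A small sketch (24,958 is the train-set size quoted in the comment above; in the notebook it would be `len(train_images)`):

```python
import math

batch_size = 64
n_train = 24958  # len(train_images) in the notebook

# One epoch should cover every training image once, so we round up
# the number of batches: ceil(24958 / 64) = 390.
steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)  # 390
```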
#Retrieving one batch of images
for X_batch, y_batch in train_iterator:
    # Create a grid of 3x3 images
    for i in range(0, 9):
        plt.subplot(330 + 1 + i)
        plt.imshow(X_batch[i].astype('float32'))
        # To display labels
        if np.argmax(y_batch[i]) == 0:
            plt.title("Uninfected")
        else:
            plt.title("Parasitized")
    # Show the plot
    plt.show()
    break
The images are now in irregular shapes and orientations, creating a more diverse training set of data.
#Reverting back to a model without Batch Normalization and Leaky Relu to avoid overfitting issue
#Reverting back to relu instead of tanh to reduce computational cost
#Maintaining sigmoid function in output layer
#Removing extra layers (using same amount of layers as base model) as there was no increase in accuracy from adding additional layers
#Define the model
model3 = Sequential()
#Add convolutional layers
model3.add(Conv2D(filters = 32, kernel_size = 2, padding = "same", activation='relu', input_shape=(64, 64, 3)))
model3.add(MaxPooling2D(pool_size = 2))
model3.add(Dropout(0.2))
model3.add(Conv2D(filters = 32, kernel_size = 2, padding = "same", activation='relu'))
model3.add(MaxPooling2D(pool_size = 2))
model3.add(Dropout(0.2))
model3.add(Conv2D(filters = 32, kernel_size = 2, padding = "same", activation='relu'))
model3.add(MaxPooling2D(pool_size = 2))
model3.add(Dropout(0.2))
#Flatten to convert 2D features to 1D vector for fully connected layers
model3.add(Flatten())
#Add dense layers
model3.add(Dense(128, activation='relu'))
model3.add(Dropout(0.5))
#Labels for this model are not One hot encoded as maintaining the binary 1D array is necessary for an output layer of 1
model3.add(Dense(1, activation='sigmoid'))
# Print the model summary
model3.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 64, 64, 32) 416
max_pooling2d (MaxPooling2D (None, 32, 32, 32) 0
)
dropout (Dropout) (None, 32, 32, 32) 0
conv2d_1 (Conv2D) (None, 32, 32, 32) 4128
max_pooling2d_1 (MaxPooling (None, 16, 16, 32) 0
2D)
dropout_1 (Dropout) (None, 16, 16, 32) 0
conv2d_2 (Conv2D) (None, 16, 16, 32) 4128
max_pooling2d_2 (MaxPooling (None, 8, 8, 32) 0
2D)
dropout_2 (Dropout) (None, 8, 8, 32) 0
flatten (Flatten) (None, 2048) 0
dense (Dense) (None, 128) 262272
dropout_3 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 1) 129
=================================================================
Total params: 271,073
Trainable params: 271,073
Non-trainable params: 0
_________________________________________________________________
model3.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Using Callbacks
callbacks = [
EarlyStopping(monitor='val_loss', patience=2),
ModelCheckpoint(filepath='.mdl_wts.hdf5', monitor='val_loss', save_best_only=True)
]
Fit and Train the model
history3 = model3.fit(train_iterator, steps_per_epoch=steps_per_epoch, validation_data=(test_images, test_labels), callbacks=callbacks, epochs=10)
Epoch 1/10
390/390 [==============================] - 163s 414ms/step - loss: 0.5752 - accuracy: 0.6919 - val_loss: 0.2370 - val_accuracy: 0.9258
Epoch 2/10
390/390 [==============================] - 162s 416ms/step - loss: 0.2510 - accuracy: 0.9035 - val_loss: 0.1815 - val_accuracy: 0.9381
Epoch 3/10
390/390 [==============================] - 162s 415ms/step - loss: 0.2154 - accuracy: 0.9240 - val_loss: 0.1741 - val_accuracy: 0.9385
Epoch 4/10
390/390 [==============================] - 160s 409ms/step - loss: 0.1918 - accuracy: 0.9322 - val_loss: 0.1333 - val_accuracy: 0.9596
Epoch 5/10
390/390 [==============================] - 162s 416ms/step - loss: 0.1954 - accuracy: 0.9348 - val_loss: 0.1148 - val_accuracy: 0.9731
Epoch 6/10
390/390 [==============================] - 162s 415ms/step - loss: 0.1825 - accuracy: 0.9401 - val_loss: 0.1170 - val_accuracy: 0.9742
Epoch 7/10
390/390 [==============================] - 159s 407ms/step - loss: 0.1779 - accuracy: 0.9409 - val_loss: 0.1122 - val_accuracy: 0.9638
Epoch 8/10
390/390 [==============================] - 162s 415ms/step - loss: 0.1733 - accuracy: 0.9422 - val_loss: 0.0942 - val_accuracy: 0.9792
Epoch 9/10
390/390 [==============================] - 161s 413ms/step - loss: 0.1671 - accuracy: 0.9441 - val_loss: 0.0862 - val_accuracy: 0.9788
Epoch 10/10
390/390 [==============================] - 163s 418ms/step - loss: 0.1650 - accuracy: 0.9442 - val_loss: 0.0899 - val_accuracy: 0.9696
Plot the train and validation accuracy
#Plotting the train and validation accuracy
plt.figure(figsize=(8, 6))
plt.plot(history3.history['accuracy'], label='Train Accuracy')
plt.plot(history3.history['val_accuracy'], label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
Validation accuracy exceeding training accuracy is expected here, because the training set contains heavily augmented, irregular images while the validation set is untouched. Maintaining high accuracy on the unaugmented validation set suggests that training on augmented data has made the model more robust when looking at real images.
Plotting the classification report and confusion matrix
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
#Predicting the classes for the test images
y_pred_classes = (model3.predict(test_images) > 0.5).astype(int)
# Printing the classification report will be useful too
print(classification_report(test_labels, y_pred_classes))
82/82 [==============================] - 3s 38ms/step
precision recall f1-score support
0 0.95 0.99 0.97 1300
1 0.99 0.95 0.97 1300
accuracy 0.97 2600
macro avg 0.97 0.97 0.97 2600
weighted avg 0.97 0.97 0.97 2600
cm = confusion_matrix(test_labels, y_pred_classes)
#Plot the confusion matrix using a heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.title("Confusion Matrix")
plt.xlabel("Predicted Labels")
plt.ylabel("True Labels")
plt.show()
Now, let us try to use a pretrained model like VGG16 and check how it performs on our data.
#Clearing the backend session
backend.clear_session()
# Setting the random seed to standardize outputs
np.random.seed(25)
random.seed(25)
tf.random.set_seed(25)
from keras.applications import VGG16
# Load the VGG model
vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))
vgg_conv.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 64, 64, 3)] 0
block1_conv1 (Conv2D) (None, 64, 64, 64) 1792
block1_conv2 (Conv2D) (None, 64, 64, 64) 36928
block1_pool (MaxPooling2D) (None, 32, 32, 64) 0
block2_conv1 (Conv2D) (None, 32, 32, 128) 73856
block2_conv2 (Conv2D) (None, 32, 32, 128) 147584
block2_pool (MaxPooling2D) (None, 16, 16, 128) 0
block3_conv1 (Conv2D) (None, 16, 16, 256) 295168
block3_conv2 (Conv2D) (None, 16, 16, 256) 590080
block3_conv3 (Conv2D) (None, 16, 16, 256) 590080
block3_pool (MaxPooling2D) (None, 8, 8, 256) 0
block4_conv1 (Conv2D) (None, 8, 8, 512) 1180160
block4_conv2 (Conv2D) (None, 8, 8, 512) 2359808
block4_conv3 (Conv2D) (None, 8, 8, 512) 2359808
block4_pool (MaxPooling2D) (None, 4, 4, 512) 0
block5_conv1 (Conv2D) (None, 4, 4, 512) 2359808
block5_conv2 (Conv2D) (None, 4, 4, 512) 2359808
block5_conv3 (Conv2D) (None, 4, 4, 512) 2359808
block5_pool (MaxPooling2D) (None, 2, 2, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
#Let's use layers up to block3_conv1, as the later layers would take longer to fit due to their significantly higher number of parameters
from tensorflow.keras.models import Model
output_layer = vgg_conv.get_layer('block3_conv1').output
#We don't want to train the pre-existing layers
vgg_conv.trainable = False
#Initializing our model with the VGG layers up to block3_conv1
model4 = Model(inputs=vgg_conv.input, outputs=output_layer)
#Adding layers to our own model
model4 = Sequential([
model4,
Flatten(),
Dense(256, activation='relu'),
Dropout(0.5),
Dense(1, activation='sigmoid')
])
model4.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Using callbacks
callbacks = [
EarlyStopping(monitor='val_loss', patience=2),
ModelCheckpoint(filepath='.mdl_wts.hdf5', monitor='val_loss', save_best_only=True)
]
Fit and Train the model
history4 = model4.fit(train_images, train_labels, validation_data=(test_images, test_labels), batch_size = 64, callbacks=callbacks, epochs=10)
Epoch 1/10
390/390 [==============================] - 778s 2s/step - loss: 8.0170 - accuracy: 0.7205 - val_loss: 0.4966 - val_accuracy: 0.8565
Epoch 2/10
390/390 [==============================] - 770s 2s/step - loss: 0.5482 - accuracy: 0.6939 - val_loss: 0.3484 - val_accuracy: 0.8958
Epoch 3/10
390/390 [==============================] - 759s 2s/step - loss: 0.4776 - accuracy: 0.7108 - val_loss: 0.4345 - val_accuracy: 0.9092
Epoch 4/10
390/390 [==============================] - 758s 2s/step - loss: 0.5369 - accuracy: 0.7106 - val_loss: 0.3711 - val_accuracy: 0.9158
The 'val_loss' reaches 0.3484 at the 2nd epoch, rises to 0.4345 at the 3rd, and falls only to 0.3711 at the 4th, still above the best value. Since 'val_loss' fails to improve for 2 consecutive epochs ('patience'=2), the EarlyStopping callback stops the training after the 4th epoch.
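One caveat with the callback setup used throughout this notebook: when EarlyStopping fires, the model left in memory holds the weights from the *last* epoch, not the best one. Keras's `restore_best_weights` flag (or reloading the ModelCheckpoint file) fixes this. A sketch of the adjusted callbacks, using the same filenames as above:

```python
from keras.callbacks import EarlyStopping, ModelCheckpoint

# With patience=2, training stops two epochs after the best val_loss,
# so without restore_best_weights the in-memory model carries the
# final (worse) epoch's weights. restore_best_weights=True rolls the
# model back to the best epoch before evaluation.
callbacks = [
    EarlyStopping(monitor='val_loss', patience=2, restore_best_weights=True),
    ModelCheckpoint(filepath='.mdl_wts.hdf5', monitor='val_loss',
                    save_best_only=True),
]
```

This would make the evaluate() results reflect the best validation epoch rather than wherever training happened to stop.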
Plot the train and validation accuracy
#Plotting the train and validation accuracy
plt.figure(figsize=(8, 6))
plt.plot(history4.history['accuracy'], label='Train Accuracy')
plt.plot(history4.history['val_accuracy'], label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
High validation accuracy indicates that the model performs well on unseen data; however, the much lower training accuracy suggests underfitting. The model is not learning effectively from the training data.
test_accuracy4 = model4.evaluate(test_images, test_labels)
print("Test Accuracy:", test_accuracy4)
82/82 [==============================] - 57s 694ms/step - loss: 0.3711 - accuracy: 0.9158
Test Accuracy: [0.3710765540599823, 0.9157692193984985]
Plotting the classification report and confusion matrix
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
#Predicting the classes for the test images
y_pred_classes = (model4.predict(test_images) > 0.5).astype(int)
# Printing the classification report will be useful too
print(classification_report(test_labels, y_pred_classes))
82/82 [==============================] - 56s 684ms/step
precision recall f1-score support
0 0.93 0.90 0.91 1300
1 0.91 0.93 0.92 1300
accuracy 0.92 2600
macro avg 0.92 0.92 0.92 2600
weighted avg 0.92 0.92 0.92 2600
cm = confusion_matrix(test_labels, y_pred_classes)
#Plot the confusion matrix using a heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.title("Confusion Matrix")
plt.xlabel("Predicted Labels")
plt.ylabel("True Labels")
plt.show()
The accuracy of the VGG16 pretrained model has dropped compared to the other models. This could be due to using only part of the network, but it also suggests that the ImageNet data VGG16 was pretrained on is not closely related to the cell images we are training on. This model produces a higher number of false positives and false negatives, indicating that it makes more classification mistakes on the cell images than the other models.
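One way to probe this result without training VGG16 from scratch is fine-tuning: unfreeze only the topmost retained layer and re-compile at a very low learning rate so the pretrained filters can adapt to stained cell imagery. This is an illustrative sketch, not a run from the notebook; it assumes the `vgg_conv` and `model4` objects defined above.

```python
from tensorflow.keras.optimizers import Adam

# Unfreeze only the last retained VGG layer (block3_conv1) so its
# ImageNet filters can adapt to cell imagery, while earlier blocks
# stay frozen to preserve generic low-level features.
vgg_conv.trainable = True
for layer in vgg_conv.layers:
    layer.trainable = (layer.name == 'block3_conv1')

# A small learning rate avoids destroying the pretrained weights;
# recompiling is required for the trainable change to take effect.
model4.compile(optimizer=Adam(learning_rate=1e-5),
               loss='binary_crossentropy',
               metrics=['accuracy'])
```

A short second fit() after this recompile would then adjust the unfrozen layer; whether it closes the accuracy gap here is an open question, not a claim.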
#Chose the model with the best accuracy scores from all the above models and saved it as a final model.
final_model = model1
Model 1 (the one after the base model) performed best of all the models created, with an accuracy of 98%. The base model also reached 98% accuracy; however, comparing the confusion matrices of both models, the base model produced more false negatives (52) than model 1 (12). This is important in this scenario because a false negative means a patient with malaria is misdiagnosed and does not receive the treatment required, so a minimal false-negative rate is highly desirable.
Improvements that can be done:
It is likely that a pre-trained model specialised in cell images would yield higher accuracy on this dataset, e.g. the Keras R-CNN model, which has been specifically trained to identify and classify large numbers of cells.
Let's try the model1 architecture using the HSV images to see if there is a noticeable difference in performance.
#Clearing the backend session
backend.clear_session()
# Setting the random seed to standardize outputs
np.random.seed(25)
random.seed(25)
tf.random.set_seed(25)
#Define the model (same architecture as model1, renamed to model_hsv)
model_hsv = Sequential()
# Add convolutional layers, changed activation functions to tanh
model_hsv.add(Conv2D(filters=32, kernel_size=2, padding="same", activation="tanh", input_shape=(64, 64, 3)))
model_hsv.add(MaxPooling2D(pool_size=2))
model_hsv.add(Dropout(0.2))
model_hsv.add(Conv2D(filters=32, kernel_size=2, padding="same", activation="tanh"))
model_hsv.add(MaxPooling2D(pool_size=2))
model_hsv.add(Dropout(0.2))
model_hsv.add(Conv2D(filters=32, kernel_size=2, padding="same", activation="tanh"))
model_hsv.add(MaxPooling2D(pool_size=2))
model_hsv.add(Dropout(0.2))
#New layers
model_hsv.add(Conv2D(filters=32, kernel_size=2, padding="same", activation="tanh"))
model_hsv.add(MaxPooling2D(pool_size=2))
model_hsv.add(Dropout(0.2))
# Flatten to convert 2D features to 1D vector for fully connected layers
model_hsv.add(Flatten())
# Add dense layer
model_hsv.add(Dense(128, activation="tanh"))
model_hsv.add(Dropout(0.5))
#Changed the output activation to sigmoid as this is a binary classification problem (infected or not infected)
#Labels for this model are not One hot encoded as maintaining the binary 1D array is necessary for an output layer of 1
model_hsv.add(Dense(1, activation="sigmoid"))
# Print the model summary
model_hsv.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_4 (Conv2D) (None, 64, 64, 32) 416
max_pooling2d_4 (MaxPooling (None, 32, 32, 32) 0
2D)
dropout_5 (Dropout) (None, 32, 32, 32) 0
conv2d_5 (Conv2D) (None, 32, 32, 32) 4128
max_pooling2d_5 (MaxPooling (None, 16, 16, 32) 0
2D)
dropout_6 (Dropout) (None, 16, 16, 32) 0
conv2d_6 (Conv2D) (None, 16, 16, 32) 4128
max_pooling2d_6 (MaxPooling (None, 8, 8, 32) 0
2D)
dropout_7 (Dropout) (None, 8, 8, 32) 0
conv2d_7 (Conv2D) (None, 8, 8, 32) 4128
max_pooling2d_7 (MaxPooling (None, 4, 4, 32) 0
2D)
dropout_8 (Dropout) (None, 4, 4, 32) 0
flatten_1 (Flatten) (None, 512) 0
dense_1 (Dense) (None, 128) 65664
dropout_9 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 1) 129
=================================================================
Total params: 78,593
Trainable params: 78,593
Non-trainable params: 0
_________________________________________________________________
model_hsv.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
callbacks = [
EarlyStopping(monitor='val_loss', patience=2),
ModelCheckpoint(filepath='.mdl_wts.hdf5', monitor='val_loss', save_best_only=True)
]
history_hsv = model_hsv.fit(train_images_hsv, train_labels, validation_data=(test_images_hsv, test_labels), callbacks=callbacks, epochs=10)
Epoch 1/10
780/780 [==============================] - 167s 210ms/step - loss: 0.7204 - accuracy: 0.5201 - val_loss: 0.6854 - val_accuracy: 0.5450
Epoch 2/10
780/780 [==============================] - 190s 244ms/step - loss: 0.6871 - accuracy: 0.5513 - val_loss: 0.6837 - val_accuracy: 0.5554
Epoch 3/10
780/780 [==============================] - 175s 224ms/step - loss: 0.6752 - accuracy: 0.5742 - val_loss: 0.7069 - val_accuracy: 0.5046
Epoch 4/10
780/780 [==============================] - 144s 184ms/step - loss: 0.6714 - accuracy: 0.5807 - val_loss: 0.6920 - val_accuracy: 0.5362
test_accuracy_hsv = model_hsv.evaluate(test_images_hsv, test_labels)
print("Test Accuracy:", test_accuracy_hsv)
82/82 [==============================] - 3s 38ms/step - loss: 0.6920 - accuracy: 0.5362
Test Accuracy: [0.6920151710510254, 0.5361538529396057]
#Plotting the train and validation accuracy
plt.figure(figsize=(8, 6))
plt.plot(history_hsv.history['accuracy'], label='Train Accuracy')
plt.plot(history_hsv.history['val_accuracy'], label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
#Predicting the classes for the test images
y_pred_classes = (model_hsv.predict(test_images_hsv) > 0.5).astype(int)
# Printing the classification report will be useful too
print(classification_report(test_labels, y_pred_classes))
82/82 [==============================] - 3s 37ms/step
precision recall f1-score support
0 0.63 0.18 0.28 1300
1 0.52 0.89 0.66 1300
accuracy 0.54 2600
macro avg 0.57 0.54 0.47 2600
weighted avg 0.57 0.54 0.47 2600
cm = confusion_matrix(test_labels, y_pred_classes)
#Plot the confusion matrix using a heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.title("Confusion Matrix")
plt.xlabel("Predicted Labels")
plt.ylabel("True Labels")
plt.show()
The model using HSV images has performed much worse, with an accuracy of 54%. This indicates that using HSV images may not be the appropriate approach for training models to detect infected cells.
Model1 (using tanh activations and a sigmoid function in the output layer) performed best, with an accuracy of 98%. The pretrained model performed worst, at 92%; however, this could be due to not implementing all of its layers, and with more parameters and more epochs its predictive accuracy might have increased. At 98% accuracy the scope for improvement is minor, but possibly achievable through transfer learning from a pretrained model specialised in predicting cell images.
I propose that model1 be adopted, as it has the highest predictive accuracy and the fewest false negatives in the confusion matrix. This model classified 98% of the data accurately, and within the 2% it did not, the instances of failing to detect malaria when it is in fact present were kept to a minimum (12 false negatives). This ensures that patients who have malaria are very rarely misdiagnosed and can receive the medical attention they require.
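Given the emphasis on minimizing false negatives, one further lever not explored above is the decision threshold: the 0.5 cutoff in `(model1.predict(test_images) > 0.5)` is tunable, and lowering it trades extra false positives for fewer missed infections. A toy NumPy illustration with hypothetical predicted probabilities (not values from this notebook):

```python
import numpy as np

# Hypothetical sigmoid outputs for six parasitized cells (true label 1).
# A cell is missed (false negative) when its probability falls at or
# below the decision threshold.
probs = np.array([0.92, 0.81, 0.47, 0.35, 0.66, 0.58])

fn_default = int(np.sum(probs <= 0.5))  # misses the 0.47 and 0.35 cells
fn_lowered = int(np.sum(probs <= 0.3))  # lowering the cutoff recovers both
print(fn_default, fn_lowered)  # 2 0
```

In practice the threshold would be chosen on a validation set, e.g. by inspecting a precision-recall curve, since each reduction in false negatives costs some false positives on uninfected cells.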